expert teacher
Highlight Every Step: Knowledge Distillation via Collaborative Teaching
Zhao, Haoran, Sun, Xin, Dong, Junyu, Chen, Changrui, Dong, Zihe
High storage and computational costs hinder the deployment of deep neural networks on resource-constrained devices. Knowledge distillation aims to train a compact student network by transferring knowledge from a larger pre-trained teacher model. However, most existing knowledge distillation methods ignore the valuable information within the training process that is associated with the training results. In this paper, we propose a new Collaborative Teaching Knowledge Distillation (CTKD) strategy which employs two special teachers. Specifically, one teacher trained from scratch (i.e., the scratch teacher) assists the student step by step using its temporary outputs. It forces the student to approach the optimal path towards the final logits with high accuracy. The other, pre-trained teacher (i.e., the expert teacher) guides the student to focus on a critical region which is more useful for the task. Combining the knowledge from the two special teachers can significantly improve the performance of the student network in knowledge distillation. Experiments on the CIFAR-10, CIFAR-100, SVHN and Tiny ImageNet datasets verify that the proposed knowledge distillation method is efficient and achieves state-of-the-art performance.
Recently, deep neural networks have achieved superior performance in a variety of applications such as computer vision [1][2][3][4] and natural language processing [5][6]. However, along with this high performance, deep neural network architectures have become much deeper and wider, which requires a high cost of computation and memory at inference. It is a great burden to deploy these models on edge-computing systems such as embedded devices and mobile phones. Therefore, many methods [7][8][9][10][11] have been proposed to reduce the computational complexity and storage requirements of deep neural networks.
Some lightweight networks such as Inception [12], MobileNet [13], ShuffleNet [14], SqueezeNet [15] and CondenseNet [16] have been proposed to reduce the network size as much as possible while keeping a high recognition accuracy. All of the above-mentioned methods focus on physically reducing the internal redundancy of the model to obtain a shallow and thin architecture.
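The two-teacher idea in the abstract above can be illustrated with a minimal sketch. This is not the paper's exact objective: it assumes the scratch teacher contributes a standard temperature-softened KL term on logits taken at the same training step, and that the expert teacher's "critical region" guidance is approximated by a squared distance between L2-normalized attention maps. The function names and the weights `alpha`, `beta`, and `temperature` are hypothetical.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(student_logits, label):
    """Hard-label loss on the student's own prediction."""
    return -math.log(softmax(student_logits)[label])

def soft_target_loss(student_logits, teacher_logits, temperature):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl

def attention_loss(student_map, expert_map):
    """Assumed stand-in for expert guidance: squared distance between
    L2-normalized (flattened) attention maps."""
    def normalize(v):
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        return [x / norm for x in v]
    s, e = normalize(student_map), normalize(expert_map)
    return sum((a - b) ** 2 for a, b in zip(s, e))

def collaborative_loss(student_logits, label, scratch_logits,
                       student_map, expert_map,
                       temperature=4.0, alpha=0.5, beta=1.0):
    """Hypothetical combined objective: hard labels + scratch teacher's
    temporary outputs + expert teacher's attention."""
    return (cross_entropy(student_logits, label)
            + alpha * soft_target_loss(student_logits, scratch_logits, temperature)
            + beta * attention_loss(student_map, expert_map))

if __name__ == "__main__":
    loss = collaborative_loss(
        student_logits=[0.2, 1.5, -0.3], label=1,
        scratch_logits=[0.1, 2.0, -0.5],   # scratch teacher at the same step
        student_map=[0.1, 0.7, 0.2], expert_map=[0.0, 0.9, 0.1],
    )
    print(f"combined loss: {loss:.4f}")
```

In this sketch the scratch teacher's logits would be refreshed at every training step, while the expert teacher's attention maps come from a fixed, already trained network.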
Learning Visual Parkour from Generated Images
Yu, Alan, Yang, Ge, Choi, Ran, Ravan, Yajvan, Leonard, John, Isola, Phillip
Fast and accurate physics simulation is an essential component of robot learning, where robots can explore failure scenarios that are difficult to produce in the real world and learn from unlimited on-policy data. Yet, it remains challenging to incorporate RGB-color perception into the sim-to-real pipeline that matches the real world in its richness and realism. In this work, we train a robot dog in simulation for visual parkour. We propose a way to use generative models to synthesize diverse and physically accurate image sequences of the scene from the robot's ego-centric perspective. We present demonstrations of zero-shot transfer to the RGB-only observations of the real world on a robot equipped with a low-cost, off-the-shelf color camera. Project website: https://lucidsim.github.io
Better than Your Teacher: LLM Agents that learn from Privileged AI Feedback
Choudhury, Sanjiban, Sodhi, Paloma
While large language models (LLMs) show impressive decision-making abilities, current methods lack a mechanism for automatic self-improvement from errors during task execution. We propose LEAP, an iterative fine-tuning framework that continually improves LLM agents using feedback from AI expert teachers. Our key insight is to equip the expert teachers with a privileged state: information that is available during training but hidden at test time. This allows even weak experts to provide precise guidance, significantly improving the student agent's performance without access to privileged information at test time. We evaluate LEAP on diverse decision-making benchmarks, including text-based games (ALFWorld), web navigation (WebShop), and interactive coding (Intercode Bash). Our experiments show that LEAP (1) outperforms behavior cloning and ReAct baselines, (2) enables weak student models (e.g., Llama3-8B) to exceed the performance of strong teacher models (GPT4-o), and (3) allows weak models to self-improve using privileged versions of themselves. We also provide a theoretical analysis showing that LEAP's success hinges on balancing privileged information with the student's realizability, which we empirically validate. Our code is available at https://leap-llm.github.io
"Mistakes Help Us Grow": Facilitating and Evaluating Growth Mindset Supportive Language in Classrooms
Handa, Kunal, Clapper, Margaret, Boyle, Jessica, Wang, Rose E, Yang, Diyi, Yeager, David S, Demszky, Dorottya
Teachers' growth mindset supportive language (GMSL), rhetoric emphasizing that one's skills can be improved over time, has been shown to significantly reduce disparities in academic achievement and enhance students' learning outcomes. Although teachers espouse growth mindset principles, most find it difficult to adopt GMSL in their practice due to the lack of effective coaching in this area. We explore whether large language models (LLMs) can provide automated, personalized coaching to support teachers' use of GMSL. We establish an effective coaching tool to reframe unsupportive utterances to GMSL by developing (i) a parallel dataset containing GMSL-trained teacher reframings of unsupportive statements with an accompanying annotation guide, (ii) a GMSL prompt framework to revise teachers' unsupportive language, and (iii) an evaluation framework grounded in psychological theory for evaluating GMSL with the help of students and teachers. We conduct a large-scale evaluation involving 174 teachers and 1,006 students, finding that both teachers and students perceive GMSL-trained teacher and model reframings as more effective in fostering a growth mindset and promoting challenge-seeking behavior, among other benefits. We also find that model-generated reframings outperform those from the GMSL-trained teachers. These results show promise for harnessing LLMs to provide automated GMSL feedback for teachers and, more broadly, LLMs' potential for supporting students' learning in the classroom. Our findings also demonstrate the benefit of large-scale human evaluations when applying LLMs in educational domains.
Can artificial intelligence replace teachers in near future? – AI.Business
Artificial intelligence applications are changing our lives. While a lot of examples show how AI is already being used, there is still a lot of room for innovation and new applications. Will artificial intelligence and machine learning replace teachers in the near future? Automation has affected nearly every industry.
Teachers Should Embrace Artificial Intelligence in Education – MeriTalk
Automation has affected nearly every industry (47 percent of U.S. employees are at risk of computer automation, according to an Oxford University study), and teachers are no longer exempt. While automation in the classroom started with automated lights, it is now expanding to automating tasks with artificial intelligence (AI) machines. While some technophobes may paint a picture of robots replacing teachers in the near future, nearly all technology experts agree that this is highly unlikely. Rather, AI will help teachers increase efficiency and improve their classroom management. The Clayton Christensen Institute, a nonprofit, nonpartisan think tank dedicated to improving the world through disruptive innovation, recently released a new report, "Teaching in the Machine Age: How Innovation Can Make Bad Teachers Good and Good Teachers Better."